Piecewise Synonyms for Enhanced UMLS Source Terminology Integration

Authors

  • Kuo-Chuan Huang
  • James Geller
  • Michael Halper
  • James J. Cimino
Abstract

The UMLS contains more than 100 source vocabularies and is growing via the integration of others. When integrating a new source, the source terms already in the UMLS must first be found. The easiest approach to this is simple string matching. However, string matching usually does not find all concepts that should be found. A new methodology, based on the notion of piecewise synonyms, for enhancing the process of concept discovery in the UMLS is presented. This methodology is supported by first creating a general synonym dictionary based on the UMLS. Each multi-word source term is decomposed into its component words, allowing for the generation of separate synonyms for each word from the general synonym dictionary. The recombination of these synonyms into new terms creates an expanded pool of matching candidates for terms from the source. The methodology is demonstrated with respect to an existing UMLS source. It shows a 34% improvement over simple string matching.
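The decompose, substitute, and recombine steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `SYNONYMS`, `EXISTING_TERMS`, `piecewise_candidates`, and `match` are hypothetical, the dictionary entries are toy data rather than content drawn from the UMLS, and terms are split on whitespace as a simplifying assumption.

```python
from itertools import product

# Toy stand-in for the general synonym dictionary (hypothetical entries,
# not taken from the UMLS).
SYNONYMS = {
    "kidney": {"renal"},
    "cancer": {"carcinoma", "neoplasm"},
}

# Toy stand-in for terms already present in the target terminology.
EXISTING_TERMS = {"renal neoplasm", "heart failure"}


def piecewise_candidates(term, synonyms):
    """Decompose a term into words, collect each word's synonyms
    (including the word itself), and recombine them into candidate terms."""
    words = term.lower().split()
    options = [sorted({w} | synonyms.get(w, set())) for w in words]
    return {" ".join(combo) for combo in product(*options)}


def match(term, existing, synonyms):
    """Return the existing terms reached by any piecewise candidate."""
    return piecewise_candidates(term, synonyms) & existing


if __name__ == "__main__":
    term = "Kidney cancer"
    print(term.lower() in EXISTING_TERMS)          # plain string match
    print(match(term, EXISTING_TERMS, SYNONYMS))   # piecewise-synonym match
```

Because the candidate pool is the Cartesian product of the per-word synonym sets, its size grows multiplicatively with term length, so a practical version would likely need some limit on how many candidates are generated.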

Similar resources

Scalability of Piecewise Synonym Identification in Integration of SNOMED into the UMLS

Synonym identification during source terminology integration into the Unified Medical Language System (UMLS) is a labor-intensive task needed for every new release of the source. The piecewise synonym (PWS) methodology was previously used for the integration of a small source. The goal of this paper is to determine whether the piecewise synonym methodology with two control parameters scales to ...

Using WordNet synonym substitution to enhance UMLS source integration

OBJECTIVE: Synonym-substitution algorithms have been developed for the purpose of matching source vocabulary terms with existing Unified Medical Language System (UMLS) terms during the integration process. A drawback is the possible explosion in the number of newly generated (potential) synonyms, which can tax computational and expert review resources. Experiments are run using a synonym-substit...

Source authenticity in the UMLS - A case study of the Minimal Standard Terminology

As the UMLS integrates multiple source vocabularies, the integration process requires that certain adaptation be applied to the source. Our interest is in examining the relationship between the UMLS representation of a source vocabulary and the source vocabulary itself. We investigated the integration of the Minimal Standard Terminology (MST) into the UMLS in order to examine how close its UMLS...

Enhanced LexSynonym Acquisition for Effective UMLS Concept Mapping

Concept mapping is important in natural language processing (NLP) for bioinformatics. The UMLS Metathesaurus provides a rich synonym thesaurus and is a popular resource for concept mapping. Query expansion using synonyms for subterm substitutions is an effective technique to increase recall for UMLS concept mapping. Synonyms used to substitute subterms are called element synonyms. The completen...

Using WordNet to Improve the Mapping of Data Elements to UMLS for Data Sources Integration

Each biomedical system has its own way of naming the pieces of information it contains, i.e., of defining its data elements (DEs). Integrating DEs facilitates the integration of biomedical resources. However, the mapping of DEs to the UMLS is ambiguous in many cases, when any correspondence is found at all. We propose to evaluate the potential contribution of a more general terminology: WordNet...

Journal title:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

Publication date: 2007